Nemesis Compression
Nemesis compression is a very dense type of graphic compression. It is so named because the first person to write a decompressor for this format had a screenname of Nemesis. Think of it as a hybrid of RLE and Huffman coding. That's why this type of compression is highly efficient in most cases. Format Each Nemesis-compressed tileset ("archive") has three sections: A header, a binary tree, and of course, the data itself. 'Header' The first two bytes represent the number of tiles in the archive. The decompressed data will always be a multiple of 32 bytes, which makes it suitable for graphics. Bit 15 of this value is a flag; it tells whether to decompress the tiles in Normal mode (if it's 0) or XOR mode (1). In XOR mode, after each row of pixels is decompressed, it is XOR'ed with the previous row. In Normal mode, each row is just decompressed as-is. Compressors for Nemesis compression will usually choose whichever format gives a smaller file size, although some compressors might give the option of whether to use a certain mode, or just use whichever mode gives a smaller size. 'Tree' Following the header is a series of bytes that define a binary tree. Each leaf of the tree stands for a run of pixels. For each run, the length can be anywhere from 1-8 pixels, and the pixel value (color) can range from $0-$F. All runs of the same pixel value are grouped together. To read the tree: *If the byte is $FF, it's the end of tree definition. *If the byte has bit 7 set ($80-$FF), all runs following will be of the pixel value determined by the last 4 bits of this byte. *If the byte has bit 7 clear, it's the first of two bytes that define a code: **In the first byte, bits 6-4 tell the length of pixels (0-7 for 1-8 pixels), and bits 3-0 tell the length of the code (1-8). **The second byte is for the code itself. The code uses as many of the least significant bits as necessary, so the code is right-aligned to the byte. Note the following restrictions to codes: *Codes can range from 1-8 bits long. *Codes that would start with six 1's in a row are reserved for inline data. For better decompression performance, the binary tree is transformed into a simple lookup table. This really ''cuts down on the decompression time. 'Data''' The data section consists of codes that were defined in the tree. If a code is in the tree, it represents the corresponding run of pixels to the decompressed data. However, a code that starts off with %111111 means that inline data is next. Inline data is 7 bits long. The first 3 bits represent the length (0-7 for 1-8 pixels), and the next 4 bits represent the pixel value ($0-$F). All data is bitpacked, and it doesn't stop until as many tiles' worth of data as represented in the header is stored. Assembly Code NemDec: ; ; INPUT: a0 = POINTER: Source ; a4 = POINTER: Destination ; movem.l d0-a1/a3-a5,-(sp) lea NemDec_WriteAndStay(pc),a3 lea $C00000,a4 bra.b NemDec_COMMON ;---------------------------------------------------------------------- NemDec_to_RAM: movem.l d0-a1/a3-a5,-(sp) lea NemDec_WriteAndAdvance(pc),a3 ;---------------------------------------------------------------------- NemDec_COMMON: move.w #$2700,sr lea __NemDec_Tree,a1 move.b (a0)+,d2 lsl.w #8,d2 move.b (a0)+,d2 add.w d2,d2 bcc.b NemDec0 adda.w #NemDec_WriteAndStay_XOR-NemDec_WriteAndStay,a3 NemDec0: lsl.w #2,d2 move.w d2,a5 moveq #8,d3 moveq #0,d2 moveq #0,d4 bsr.w NemDecPrepare move.b (a0)+,d5 asl.w #8,d5 move.b (a0)+,d5 moveq #$10,d6 bsr.b NemDecRun move.w #$2000,sr movem.l (sp)+,d0-a1/a3-a5 rts ; NemDecRun: move.w d6,d7 subq.w #8,d7 move.w d5,d1 lsr.w d7,d1 cmpi.b #$FC,d1 bcc.b W2 andi.w #$FF,d1 add.w d1,d1 move.b 0(a1,d1.w),d0 ext.w d0 sub.w d0,d6 cmpi.w #9,d6 bcc.b NemDec1 addq.w #8,d6 asl.w #8,d5 move.b (a0)+,d5 NemDec1: move.b 1(a1,d1.w),d1 move.w d1,d0 andi.w #$0F,d1 andi.w #$F0,d0 Part2: lsr.w #4,d0 Part3: lsl.l #4,d4 or.b d1,d4 subq.w #1,d3 bne.b NemDec_WriteIter_Part2 jmp (a3) ;---------------------------------------------------------------------- NemDec_WriteIter: moveq #0,d4 moveq #8,d3 ;---------------------------------------------------------------------- NemDec_WriteIter_Part2: dbra d0,Part3 bra.b NemDecRun ;---------------------------------------------------------------------- W2: subq.w #6,d6 cmpi.w #9,d6 bcc.b NemDec2 addq.w #8,d6 asl.w #8,d5 move.b (a0)+,d5 NemDec2: subq.w #7,d6 move.w d5,d1 lsr.w d6,d1 move.w d1,d0 andi.w #$0F,d1 andi.w #$70,d0 cmpi.w #9,d6 bcc.b Part2 addq.w #8,d6 asl.w #8,d5 move.b (a0)+,d5 bra.b Part2 ;---------------------------------------------------------------------- NemDec_WriteAndStay: move.l d4,(a4) subq.w #1,a5 move.w a5,d4 bne.s NemDec_WriteIter rts ;---------------------------------------------------------------------- NemDec_WriteAndStay_XOR: eor.l d4,d2 move.l d2,(a4) subq.w #1,a5 move.w a5,d4 bne.s NemDec_WriteIter rts ;---------------------------------------------------------------------- NemDec_WriteAndAdvance: move.l d4,(a4)+ subq.w #1,a5 move.w a5,d4 bne.s NemDec_WriteIter rts ;---------------------------------------------------------------------- NemDec_WriteAndAdvance_XOR: eor.l d4,d2 move.l d2,(a4)+ subq.w #1,a5 move.w a5,d4 bne.s NemDec_WriteIter rts ;---------------------------------------------------------------------- NemDecPrepare: move.b (a0)+,d0 NemDecPrepare2: cmpi.b #$FF,d0 bne.b NemDec4 rts NemDec4: move.w d0,d7 Pr2: move.b (a0)+,d0 bmi.b NemDecPrepare2 move.b d0,d1 andi.w #$0F,d7 andi.w #$70,d1 or.w d1,d7 andi.w #$F,d0 move.b d0,d1 lsl.w #8,d1 or.w d1,d7 moveq #8,d1 sub.w d0,d1 bne.b Pr3 move.b (a0)+,d0 add.w d0,d0 move.w d7,(a1,d0.w) bra.b Pr2 ;---------------------------------------------------------------------- Pr3: move.b (a0)+,d0 lsl.w d1,d0 add.w d0,d0 moveq #1,d5 lsl.w d1,d5 subq.w #1,d5 NemDec5: move.w d7,(a1,d0.w) addq.w #2,d0 dbra d5,NemDec5 bra.b Pr2 The code is 324 bytes long, which is equivalent to 10.125 tiles.